# Common Voice Dataset

Whisper Kurmanji

An automatic speech recognition model for the Kurmanji dialect of Kurdish, built by fine-tuning a Whisper-architecture model.

Speech Recognition

Safetensors Other

Vlzcrz Whisper Small Japanese 2

A Japanese speech recognition model based on openai/whisper-small, fine-tuned on the Common Voice 17.0 dataset.

Speech Recognition

Transformers Japanese

Whisper Large V3 Japanese 4k Steps

A speech recognition model based on openai/whisper-large-v3, fine-tuned for 4,000 steps on the Common Voice 16.1 Japanese dataset.

Speech Recognition

Transformers Japanese

Tts Thai Last Step

A Thai text-to-speech model based on the Tacotron2 architecture, trained on a modified Common Voice Thai dataset whose processed speech does not retain the original speakers' characteristics.

Speech Synthesis Other

A Thai text-to-speech model based on the Tacotron2 architecture, trained using a modified Common Voice Thai dataset

Speech Synthesis Other

Exp W2v2t Zh Cn Wavlm S596

A Simplified Chinese speech recognition model fine-tuned from microsoft/wavlm-large on the Common Voice 7.0 (zh-CN) dataset.

Speech Recognition

Exp W2v2t Ja Xlsr 53 S109

A Japanese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice 7.0 Japanese dataset.

Speech Recognition

Transformers Japanese

Wav2vec2 Large Xls R 300m Hindi Epochs15 Colab

A Hindi speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice dataset.

Speech Recognition

Wav2vec2 Common Voice Tr Demo Dist

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - TR (Turkish) dataset, achieving a word error rate of 0.3242 on the evaluation set.

Speech Recognition

Transformers Other
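Several entries in this list report a word error rate (WER), sometimes as a fraction (0.3242) and sometimes as a percentage (8.8%); a value of 0.3242 corresponds to 32.42%. As a rough illustration of what the metric measures, here is a minimal sketch that computes WER as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical, and the sketch applies no text normalization (casing, punctuation), which published figures may or may not use.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a six-word reference.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER = {wer:.4f} ({wer * 100:.2f}%)")
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words, so lower values are better but the score is not bounded at 100%.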

Wav2vec2 Large Xls R 300m Turkish Colab Common Voice 8 4

A speech recognition model fine-tuned from Facebook's wav2vec2-xls-r-300m on the Common Voice Turkish dataset.

Speech Recognition

Wav2vec2 Large Xls R 300m German With Lm

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice German dataset and combined with an n-gram language model, achieving a word error rate of 8.8%.

Speech Recognition

Wav2vec2 Base Cv 10000

A speech recognition model fine-tuned from wav2vec2-base-cv on the Common Voice dataset, achieving a word error rate of 36.84% on the evaluation set.

Speech Recognition

Wav2vec2 Base Checkpoint 12

A version of wav2vec2-base-checkpoint-11.1 fine-tuned on the Common Voice dataset, primarily intended for speech recognition.

Speech Recognition

Wav2vec2 Xlsr Romansh Sursilvan

An automatic speech recognition model for Romansh Sursilvan, fine-tuned from facebook/wav2vec2-xls-r-1b, achieving a word error rate (WER) of 13.82% on the Common Voice 8 test set.

Speech Recognition

Wav2vec2 Large Xls R 300m Hausa

An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on Hausa speech data.

Speech Recognition

Transformers Other

Wav2vec2 Xls R 300m Zh TW

A Taiwanese Mandarin (zh-TW) speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the COMMON_VOICE - ZH-TW dataset.

Speech Recognition

Wav2vec2 Base Checkpoint 14

A speech recognition model based on the wav2vec2 architecture, fine-tuned on the Common Voice dataset

Speech Recognition

This is a small random robustness model based on the wav2vec2 architecture, fine-tuned on the MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - AB dataset for automatic speech recognition tasks.

Speech Recognition

Transformers Other

Tts Transformer Ar Cv7

A Transformer-based text-to-speech model using fairseq S^2, supporting Arabic single male speaker synthesis

Speech Synthesis Arabic

Wav2vec2 Xlsr Multilingual 56

This is a multilingual automatic speech recognition (ASR) model supporting 56 languages, fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset.

Speech Recognition

Transformers Supports Multiple Languages

Wav2vec2 Base Turkish Cv7

A Turkish automatic speech recognition model based on the wav2vec2 architecture, fine-tuned on the Common Voice 7.0 Turkish dataset.

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Dutch

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Dutch Common Voice dataset, achieving a test WER of 17.09%.

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr Persian V2

A Persian (Farsi) automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset.

Speech Recognition Other

Wav2vec2 Hausa2 Demo Colab

A Hausa speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset.

Speech Recognition

Wav2vec2 Xls R 300m Arabic

An Arabic automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 7 Arabic dataset.

Speech Recognition

Transformers Arabic
